(57) Claim

1. A synthetic regulation region for the expression of
heterologous genes in E. coli, which contains a promoter, a
modified lac operator and a ribosomal binding site, having
one or more of the following features:

a) a spacer group of 15 to 18 base-pairs is located
between the -35 and the -10 regions, and

b) a spacer group of 6 to 14 base-pairs is located
between the ribosomal binding site and the ATG
start codon

and optionally:

c) the -35 region in the promoter has the following
nucleotide sequence (coding strand)

TTGACAT or CTTGACAT,

d) the -10 region in the promoter has the following
nucleotide sequence (coding strand)

GTATAAT.

In the preparation of eukaryotic polypeptides in
bacteria, in particular E. coli, by genetic engineering,
the "heterologous" gene which codes for the desired
eukaryotic polypeptide is incorporated in a suitable
vector, and this hybrid vector is then introduced into the
bacterial host. However, a number of conditions must be
met for this heterologous gene to be able to bring about
the production of the desired polypeptide. An essential
condition is a functioning regulation region, consisting
of an operator, a promoter and a so-called "Shine-Dalgarno
sequence", also called an SD sequence or, simplifying
(since only the corresponding sequence of the mRNA binds
to the ribosome), called the "ribosomal binding site"
below.

Correct expression of the gene, that is to say the
production of the desired polypeptide, is conditional on
recognition of the promoter by the bacterial host. The
enzyme RNA polymerase which is intrinsic to the host
recognizes a part sequence in the DNA of the promoter and
binds to this part sequence. This brings about an opening
of the double-stranded DNA in this region, whereupon the
synthesis of the mRNA on the coding strand (transcription
strand) starts.

The operator, which frequently overlaps with the
promoter, is recognized by a repressor protein which is
intrinsic to the host. The frequency of transcription is
controlled by the more or less effective binding of this
repressor protein to the operator. This system is affected
by inducers (inducer molecules) which bind the repressor
protein and thus activate the operator.

Finally, the ribosomal binding site is responsible
for the initially synthesized part of the mRNA containing
an RNA sequence for the binding to the ribosome on which
the translation into the polypeptide takes place.

Thus the regulation region is responsible for the
expression of the gene to give the desired polypeptide via
the transcription stage (transcription of DNA into mRNA)
and the subsequent translation. Apart from the sequence
of nucleotides, an important point about the structure of
a regulation region of this type is the geometry, that is
to say the spatial arrangement of the promoter, operator
and ribosomal binding site.

In the following text, the numbering of the
nucleotides relates to the site at which transcription
starts (zero), counting being, as usual, from 5' to 3'.

Natural promoters for E. coli RNA polymerase have
two regions of DNA sequences which are conserved. One is
the -35 region and the other is the -10 region, also
called the "Pribnow-Schaller box", the numbering being
based on the abovementioned nucleotide numbering, that is
to say these regions are located upstream of the start of
transcription.

The regulation sequence according to the
invention represents a modification of the natural
regulation sequences and ensures optimal binding of the
RNA polymerase to the promoter and effective utilization
of the operator. The regulation sequence according to
the invention can either be placed directly upstream of
the heterologous gene, whereupon the desired polypeptide
(with methionine at the amino terminal end) is
expressed, or a bacterial gene, in whole or in part,
is interpolated upstream of the heterologous gene, this
leading to expression of a fusion protein in which (a
portion of) the bacterial protein is bonded to the amino
terminal end of the desired polypeptide.

The synthetic regulation region according to
the invention for the expression of heterologous genes in
E. coli, containing a promoter, a modified lac operator
and a ribosomal binding site, comprises one or more of the
following features:

a) a spacer group of 15 to 18 base-pairs is located
between the -35 and the -10 regions, and

b) a spacer group of 6 to 14 base-pairs is located
between the ribosomal binding site and the ATG
start codon

and optionally:

c) the -35 region in the promoter has the following
nucleotide sequence (coding strand)

TTGACAT or CTTGACAT,

d) the -10 region in the promoter has the following
nucleotide sequence (coding strand)

GTATAAT.
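
Features a) and b) can be checked mechanically on a coding-strand sequence. The following is a minimal sketch in Python, assuming the -35 and -10 regions are those given under c) and d). The function names, the ribosomal binding site AGGA and the test sequence are illustrative assumptions and are not taken from the specification.

```python
def spacer_length(seq, upstream, downstream):
    """Number of bases between the first occurrence of `upstream`
    and the next occurrence of `downstream`, or None if not found."""
    start = seq.find(upstream)
    if start < 0:
        return None
    gap_start = start + len(upstream)
    end = seq.find(downstream, gap_start)
    if end < 0:
        return None
    return end - gap_start


def check_features(seq, rbs, minus35="TTGACAT", minus10="GTATAAT"):
    """Check features a) and b) on a coding strand written 5' to 3'."""
    promoter_gap = spacer_length(seq, minus35, minus10)
    rbs_gap = spacer_length(seq, rbs, "ATG")
    return {
        "promoter_spacer": promoter_gap,
        "feature_a": promoter_gap is not None and 15 <= promoter_gap <= 18,
        "rbs_spacer": rbs_gap,
        "feature_b": rbs_gap is not None and 6 <= rbs_gap <= 14,
    }


# Hypothetical sequence assembled only to exercise the checks.
example = ("TTGACAT" + "TTTTTAAATAATTTG" + "GTATAAT"
           + "GTGTGG" + "AGGA" + "TCTAGAATTC" + "ATG")
result = check_features(example, rbs="AGGA")
print(result)
```

The sketch takes the first match of each element, so a real sequence containing several candidate regions would need a more careful search.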

Other embodiments of this invention include the
following:

A regulation region wherein the spacer region between
the -35 and the -10 regions has 15 base-pairs (bp) and/or is
A-T rich. A regulation region wherein the spacer group
between the ribosome binding site (r.b.s.) and the start
codon has 10 bp and/or is A-T rich. A regulation region
wherein the r.b.s. is rich in purines. A regulation region
wherein the modified lac operator has the DNA sequence I or
IIa as herein defined.
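
The terms "A-T rich" and "rich in purines" can be quantified as simple base fractions. The sketch below is only an illustration in Python: the specification gives no numerical threshold for "rich", and the two input sequences are hypothetical.

```python
def at_fraction(seq):
    """Fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def purine_fraction(seq):
    """Fraction of purine bases (A and G) in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "AG" for base in seq) / len(seq)


# Hypothetical spacer and ribosomal binding site, used only as inputs.
spacer = "TTTTTAAATAATTTG"
rbs = "AGAGGA"
print(round(at_fraction(spacer), 2))
print(purine_fraction(rbs))
```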

Apart from the advantages already mentioned, the
regulation region according to the invention is
distinguished by being very variable and, because there is a
number of unique restriction sites, it allows the individual
elements, namely the promoter, operator and ribosomal
binding site, to be cut out and combined with known systems.
Moreover, by modification of the spacer groups, it is
possible to vary the geometry and suit it even better to the
circumstances of the individual case.
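
Whether a restriction site is unique within a given region can be checked by counting occurrences of its recognition sequence. The following Python sketch assumes the standard recognition sequences of the enzymes named in this specification; the example fragment is hypothetical. Because all four sites are palindromic, searching one strand is sufficient.

```python
# Recognition sequences (5' to 3') of enzymes named in this specification.
SITES = {
    "Bam HI": "GGATCC",
    "Eco RI": "GAATTC",
    "Aha III": "TTTAAA",
    "Sal I": "GTCGAC",
}


def site_positions(seq, site):
    """0-based start positions of every match, including overlapping ones."""
    positions = []
    start = seq.find(site)
    while start >= 0:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def unique_sites(seq):
    """Names of enzymes whose site occurs exactly once in `seq`."""
    return sorted(name for name, site in SITES.items()
                  if len(site_positions(seq, site)) == 1)


# Hypothetical fragment for illustration.
fragment = "AGCGGATCCTAAATTTAAATGGAGAATTCACT"
print(unique_sites(fragment))
```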

The regulation region according to the invention is
preferably constructed by complete synthesis. It is
possible to use the known DNA synthetic methods for this
purpose, for example the phosphite method.

DNA sequence I (Appendix) shows an advantageous
embodiment of the total regulation region. DNA sequences
IIa to IIh show specific, preferred embodiments of sequence
I. For the synthesis of sequences I and II, a few
nucleotide pairs which permit attack by restriction enzymes
for "cutting to size" are additionally attached to the 5'
and 3' ends in each case. DNA sequence IIa is also given in
complete detail, three nucleotide pairs being placed on each
of the 5' and 3' ends. Obviously, it would also be possible
to attach other or more nucleotide pairs to these.

In the Examples which follow, specific embodiments
of the invention are illustrated in detail, from which the
large number of possible modifications and combinations is
evident to those skilled in the art. Unless otherwise
specified, percentage data in these Examples relate to
weight.

Example 1
Synthesis of DNA sequence IIa

a) Chemical synthesis of a single-stranded oligonucleotide

The synthesis of the structural units of the gene
is illustrated by the example of structural unit Ia of the
gene, which comprises nucleotides 1-19 (plus three others
at the 5' end to allow attack by Bam HI) of the coding
strand. Using known methods (M. J. Gait et al., Nucleic
Acids Res. 8 (1980) 1081-1096), the nucleoside located at
the 3' end, that is to say in the present case thymidine
(nucleotide no. 19), is covalently bonded via the
3'-hydroxyl group to silica gel ((R)FRACTOSIL, supplied by
Merck). For this purpose, first the silica gel is reacted
with 3-(triethoxysilyl)propylamine with elimination of
ethanol and formation of an Si-O-Si bond. The thymidine is
reacted with the modified carrier in the presence of
para-nitrophenol and N,N'-dicyclohexylcarbodiimide, the free
carboxyl group of the succinoyl group acylating the amino
radical of the propylamine group.

In the following synthetic steps, the base component
is used in the form of the dialkylamide or chloride
of the monomethyl ester of the
5'-O-dimethoxytritylnucleoside 3'-phosphorous acid, the
adenine being in the form of the N6-benzoyl compound, the
cytosine being in the form of the N4-benzoyl compound, the
guanine being in the form of the N2-isobutyryl compound and
the thymidine, which contains no amino group, being without
a protective group.

100 mg of the polymeric carrier which contains
4 µmol of bound thymidine are treated consecutively with
the following agents:

a) nitromethane
b) saturated zinc bromide solution in nitromethane
containing 1% water
c) methanol
d) tetrahydrofuran
e) acetonitrile
f) 80 µmol of the appropriate nucleoside phosphite and
400 µmol of tetrazole in 1 ml of anhydrous acetonitrile
(5 minutes)
g) 20% acetic anhydride in tetrahydrofuran containing 40%
lutidine and 10% dimethylaminopyridine (2 minutes)
h) tetrahydrofuran
i) tetrahydrofuran containing 20% water and 40% lutidine
j) 3% iodine in collidine/water/tetrahydrofuran in the
ratio by volume of 5:4:1
k) tetrahydrofuran and
l) methanol.
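
The quantities stated for this cycle imply a fixed molar excess of reagents over the bound nucleoside. The following Python sketch works through that arithmetic, assuming that one pass through the listed steps adds one nucleotide and that the 3'-terminal nucleoside is already on the carrier; the function names are illustrative.

```python
BOUND_NUCLEOSIDE_UMOL = 4.0   # thymidine bound to 100 mg of carrier
PHOSPHITE_UMOL = 80.0         # nucleoside phosphite per cycle
TETRAZOLE_UMOL = 400.0        # tetrazole per cycle


def coupling_cycles(oligo_length):
    """Cycles needed when the 3'-terminal nucleoside is pre-bound."""
    return oligo_length - 1


def molar_excess(reagent_umol, bound_umol=BOUND_NUCLEOSIDE_UMOL):
    """Reagent amount relative to the bound nucleoside."""
    return reagent_umol / bound_umol


length_ia = 19 + 3  # nucleotides 1-19 plus three extra at the 5' end
cycles = coupling_cycles(length_ia)
print(cycles)                          # 21
print(molar_excess(PHOSPHITE_UMOL))    # 20.0
print(molar_excess(TETRAZOLE_UMOL))    # 100.0
print(cycles * PHOSPHITE_UMOL)         # 1680.0 (µmol of phosphite in total)
```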

In this context, the term "phosphite" is to be
understood to be the monomethyl ester of the deoxyribose
3'-monophosphorous acid, the third valency being saturated
by chlorine or a tertiary amino group, for example a
morpholino radical. The yields in the individual synthetic
steps can in each case be determined, after the
detritylation reaction (b), by spectrophotometry by
measuring the absorption of the dimethoxytrityl cation at a
wavelength of 496 nm.
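
A stepwise yield can be derived from successive trityl readings. The sketch below is one common way of doing this in Python, assuming that the absorbance at 496 nm is proportional to the amount of dimethoxytrityl cation released and that every reading is taken in the same volume; the readings are hypothetical and the specification does not prescribe this calculation.

```python
from math import prod


def stepwise_yields(absorbances):
    """Ratio of each trityl absorbance to the preceding one."""
    return [later / earlier
            for earlier, later in zip(absorbances, absorbances[1:])]


def overall_yield(absorbances):
    """Product of the stepwise yields."""
    return prod(stepwise_yields(absorbances))


# Hypothetical absorbance readings, one per detritylation.
readings = [1.00, 0.98, 0.95, 0.93]
print([round(y, 3) for y in stepwise_yields(readings)])
print(round(overall_yield(readings), 3))
```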

After the synthesis of the oligonucleotide is
complete, the methyl phosphate protective groups of the
oligomer are eliminated using p-thiocresol and
triethylamine. The oligonucleotide is then removed from the
carrier by treatment with ammonia for 3 hours. Treatment
of the oligomers with concentrated ammonia for 2 to 3 days
quantitatively eliminates the amino protective groups on
the bases. The crude product thus obtained is purified by
high-pressure liquid chromatography (HPLC) or by
polyacrylamide gel electrophoresis.

The other structural units Ib-Ih of the gene,
whose nucleotide sequences are derived from DNA sequence
IIa, are also synthesized entirely correspondingly.

b) Enzymatic linkage of the single-stranded
oligonucleotides

For the phosphorylation of the oligonucleotides
at the 5' terminal end, 1.0 nmol of each of
oligonucleotides Ib and Ic are treated, with the addition
of 10 nmol of adenosine triphosphate, with 6 units of T4
polynucleotide kinase in 20 µl of 50 mM tris HCl buffer
(pH 7.6), 10 mM magnesium chloride and 10 mM dithiothreitol
(DTT) at 37°C for 30 minutes (C.C. Richardson, Progress in
Nucl. Acid Res. 2 (1972) 825). The enzyme is
inactivated by heating at 95°C for 5 minutes.

Oligonucleotides If and Ig are phosphorylated
analogously.
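
The amounts and volume given for the kinase reaction can be converted to concentrations. The Python sketch below assumes that the quantities read as 1.0 nmol of oligonucleotide, 10 nmol of adenosine triphosphate and a reaction volume of 20 µl; the helper name is illustrative.

```python
def micromolar(amount_nmol, volume_ul):
    """Concentration in µmol/l from an amount in nmol and a volume in µl."""
    # 1 nmol per µl equals 1 mmol/l, i.e. 1000 µmol/l.
    return amount_nmol * 1000.0 / volume_ul


oligo_um = micromolar(1.0, 20.0)   # each oligonucleotide
atp_um = micromolar(10.0, 20.0)    # adenosine triphosphate
print(oligo_um)                    # 50.0
print(atp_um)                      # 500.0
print(atp_um / oligo_um)           # 10.0
```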

Oligonucleotides Ia and Id and phosphorylated
oligonucleotides Ib and Ic are heated in 40 µl of
50 mM tris HCl buffer at 95°C for 5 minutes, and they
are then allowed to cool slowly to room temperature. To
this mixture are added 20 mM magnesium chloride, 10 mM
DTT and 1 mM ATP, and reaction with 100 units of T4 DNA
ligase is allowed to continue at 25°C for 16 hours.

Fragments Ie and Ih are linked with phosphorylated
fragments If and Ig analogously.
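
Annealing before ligation relies on the overlapping ends of neighbouring oligonucleotides being complementary. The following Python sketch shows how such an overlap could be checked; the two oligonucleotides and the overlap length are hypothetical and do not correspond to structural units Ia to Ih.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneals(top, bottom, overlap):
    """True if the 3'-terminal `overlap` bases of `top` pair with the
    3'-terminal `overlap` bases of `bottom` (both written 5' to 3')."""
    return top[-overlap:].upper() == reverse_complement(bottom[-overlap:])


# Hypothetical oligonucleotides with a 4-base complementary overlap.
top_oligo = "AAGGCTTACG"
bottom_oligo = "GGGCCCCGTA"
print(anneals(top_oligo, bottom_oligo, 4))
```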

The product of the ligase reaction of
oligonucleotides Ia to Id (gene fragment A) is freeze-dried
and incubated in 100 µl of a buffer solution (150 mM NaCl,
10 mM tris HCl, pH 7.6, 6 mM magnesium chloride), which
contains 200 units of the endonuclease Bam HI, at 37°C for
3 hours.

The product of the ligase reaction of
oligonucleotides Ie to Ih (gene fragment B) is freeze-dried
and incubated in 100 µl of a buffer solution (100 mM tris
HCl, pH 7.5, 50 mM NaCl, 5 mM magnesium chloride), which
contains 200 units of endonuclease Eco RI, at 37°C for
3 hours. After the enzyme digestion has been stopped by
heating at ?5°C for 2 minutes, the cut gene fragments
A and B are purified by gel electrophoresis on 15%
polyacrylamide gel (without addition of urea, 20 x 40 cm,
2 mm thick), the marker substance used being pBR 322
(supplied by Biolabs) cut with Hae III. After extraction
of the DNA bands and purification on (R)Sephadex G50 and
Sep-Pak (supplied by Waters), gene fragments A and B are
linked by "blunt end ligation". A synthetic regulation
region corresponding to DNA sequence IIa which has, on
the 5' end of the coding strand, an extension for attack
by Bam HI and, on the 3' end of the coding strand, an
extension for attack by Eco RI is obtained in this way.

Example 2:

Hybrid plasmids which contain the synthetic
control region

a) Incorporation of the control region in pUC 8

The commercially available plasmid pUC 8 is opened
in known manner using restriction endonucleases Eco RI
and Bam HI, and is separated, using 1% low-melting agarose
gels, from the oligonucleotides which have been cut out.
The cut plasmid is recovered after dissolving the gel at
elevated temperature in accordance with the statements of
the manufacturers. 1 µg of the pUC 8 plasmid thus opened
is ligated with 10 pg of the synthetic control region
using T4 DNA ligase at 14°C overnight. In this way, a
modified pUC 8 plasmid having the integrated control
region is obtained. This hybrid plasmid is represented
in Figure 1, in which the control region is indicated by
SIP, which stands for "synthetic idealized promoter".

b) Transformation

The strain E. coli K 12 is made competent by
treatment with a 70 mM calcium chloride solution, and
a suspension of the hybrid plasmid in 10 mM tris HCl
buffer (pH 7.5), which is 70 mM in calcium chloride,
is added. The transformed strains are selected for
ampicillin resistance, and the inserted sequence is verified
by Maxam-Gilbert sequence analysis (A. Maxam and
W. Gilbert, Proc. Natl. Acad. Sci. USA 74, 560-564 (1977)).

Example 3:

Expression plasmids which contain the synthetic
control region

The commercially available plasmid pBR 322 is
opened using the restriction enzymes Bam HI and Sal I and
is purified on a 1% agarose gel as described above.
The synthetic control region is reisolated from the
modified pUC 8 derivative by cutting with the enzymes
Eco RI and Bam HI, purified on 10% polyacrylamide gels,
and recovered by subsequent electroelution. In an
analogous manner, the γ-interferon gene is isolated from
appropriate hybrid plasmids by cutting with the restriction
enzymes Eco RI and Sal I and purifying on a 2% low-melting
agarose gel. The hybrid plasmid containing the
γ-interferon gene which is used is the plasmid pMX 2 whose
preparation is described in German Patent Application
P 34 09 966.2. However, the plasmid described in European
Patent Application 0,095,350 can also be used.

The linearized plasmid pBR 322, the synthetic control
region and the γ-interferon gene are then ligated in known
manner, the plasmid as shown in Figure 2 being obtained.
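
The three-fragment ligation closes to a circle because each junction joins two ends produced by the same enzyme. The Python sketch below checks this for the fragments named in this Example. It is a simplification: it assumes that ends cut by the same enzyme are cohesive, that ends cut by different enzymes are not, and that each fragment is listed in a fixed orientation; the data layout and function name are illustrative.

```python
# Each fragment is described by the enzymes that produced its two ends,
# listed as (left end, right end).
fragments = {
    "pBR 322 vector": ("Sal I", "Bam HI"),
    "synthetic control region": ("Bam HI", "Eco RI"),
    "gamma-interferon gene": ("Eco RI", "Sal I"),
}


def forms_circle(order, ends):
    """True if, in the given order, each fragment's right end was cut by
    the same enzyme as the next fragment's left end (wrapping around)."""
    for i, name in enumerate(order):
        right = ends[name][1]
        next_left = ends[order[(i + 1) % len(order)]][0]
        if right != next_left:
            return False
    return True


intended = ["pBR 322 vector", "synthetic control region",
            "gamma-interferon gene"]
swapped = ["pBR 322 vector", "gamma-interferon gene",
           "synthetic control region"]
print(forms_circle(intended, fragments))  # True
print(forms_circle(swapped, fragments))   # False
```

With three different enzymes, only one cyclic order of the fragments satisfies the check, which places the control region directly upstream of the gene.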

Example 4:

Comparison of the activity of the synthetic
control region with that of the known potent tac control
region.

The plasmid pKK 177.3 known from the literature
(E. Amann et al., Gene 25, 167 (1983)) is cut using the
restriction enzymes Eco RI and Sal I as described above,
and the γ-interferon gene is incorporated, by which means
the latter is coupled to the tac control region. The
plasmid shown in Figure 3 is obtained.

To remove the tac control region from pKK 177.3,
the latter is digested with Bam HI and Sal I. The plasmid
thus linearized is ligated with the γ-interferon gene and
the synthetic control region as described above. The
hybrid plasmid shown in Figure 4 is obtained.

The hybrid plasmids shown in Figure 3 and Figure
4 are transformed into E. coli K 12, and the bacteria are
cultured in 2 YT medium (Miller, Experiments in Molecular
Genetics; 1972) until an optical density of 1 at 578 nm
in the shake culture is reached. 0.1 mM IPTG
(isopropyl-β-thiogalactopyranoside) is added to one portion
of the bacterial culture, and thus induces the synthesis of
γ-interferon for 2 hours.

The bacteria are then removed by centrifugation
and disrupted by treatment with lysozyme, EDTA and
ultrasound (Maniatis et al., Molecular Cloning, Cold Spring
Harbor, 1982). The interferon titers of the bacterial
lysates are found with a commercially available
radioimmunoassay (Celltech) to be as follows:

Plasmid shown in Figure 3: 1 x 10^5 units per ml
Plasmid shown in Figure 4: 1.5 x 10^5 units per ml.

Thus, in comparison with the tac region, which is
known to be excellent, the γ-interferon yield obtained with
the control region according to the invention is 50% higher.
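
The stated 50% follows directly from the two titers. A short Python check of the arithmetic, with an illustrative function name:

```python
def relative_increase(reference, value):
    """Percentage by which `value` exceeds `reference`."""
    return (value - reference) / reference * 100.0


tac_titer = 1.0e5        # units per ml, plasmid shown in Figure 3
synthetic_titer = 1.5e5  # units per ml, plasmid shown in Figure 4
print(relative_increase(tac_titer, synthetic_titer))  # 50.0
```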
DNA sequence IIa

Nucleotides 1 to 94 are numbered from the first G of the
Bam HI site; the three nucleotide pairs at each end are the
attached extensions. The boundaries of oligonucleotides Ia
to Ih marked in the original figure are not reproduced.

      Bam HI
5' AGCGGATCCTAAATAAATTCTTGACAT
3' TCGCCTAGGATTTATTTAAGAACTGTA

     Aha III
5' TTTTTAAATAATTTGGTATAATGT
3' AAAAATTTATTAAACCATATTACA

5' GTGGAATTGTGAGCGGAATAACAATTT
3' CACCTTAACACTCGCCTTATTGTTAAA

                Eco RI
5' CACAGAGGATCTAGAATTCACT
3' GTGTCTCCTAGATCTTAAGTGA
THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:

1. A synthetic regulation region for the expression of
heterologous genes in E. coli, which contains a promoter, a
modified lac operator and a ribosomal binding site, having
one or more of the following features:

a) a spacer group of 15 to 18 base-pairs is located
between the -35 and the -10 regions, and

b) a spacer group of 6 to 14 base-pairs is located
between the ribosomal binding site and the ATG
start codon

and optionally:

c) the -35 region in the promoter has the following
nucleotide sequence (coding strand)

TTGACAT or CTTGACAT,

d) the -10 region in the promoter has the following
nucleotide sequence (coding strand)

GTATAAT.

2. A regulation region as claimed in claim 1 wherein 
the spacer group between the -35 and the -10 regions has 17 
base-pairs and/or is rich in A and T. 

3. A regulation region as claimed in claim 1 or claim 
2, wherein the spacer group between the ribosomal binding 
site and the start codon has 10 base-pairs and/or is rich in 
A and T. 

4. A regulation region as claimed in one or more of 
the preceding claims, wherein the ribosomal binding site is 
rich in purines. 



5. A regulation region as claimed in one or more of 
the preceding claims, wherein the lac operator has the DNA 
sequence I or Ila as herein defined. 

6. A gene structure containing a regulation region as 
claimed in any one of claims 1 to 5. 

7. A hybrid vector containing a gene structure as 
claimed in claim 6. 

8. E. coli containing a hybrid vector as claimed in 
claim 7. 

9. A polypeptide expressed by E. coli as claimed in
claim 8.

DATED this 16th day of October, 1989,
HOECHST AKTIENGESELLSCHAFT